A Novel method for similarity analysis and protein sub-cellular localization prediction

Authors

  • Bo Liao
  • Benyou Liao
  • Xingming Sun
  • Qingguang Zeng
Abstract

MOTIVATION Biological sequences are an important object of study for many biologists because they carry a large amount of biological information that helps scientists study cells, DNA and proteins. Many researchers currently use methods based on protein sequences, including some machine-learning methods, for function classification, sub-cellular localization, and structure and functional-site prediction. The purpose of this article is to find a new approach to sequence analysis that is simpler and more effective. RESULTS Based on the properties of the 64 codons of the genetic code, we propose a simple and intuitive 2D graphical representation of protein sequences. Based on this representation, we give a new Euclidean-distance method for computing the distance between sequences for sequence-similarity analysis. This approach retains more sequence information. A typical phylogenetic tree constructed with this method demonstrates the effectiveness of our approach. Finally, we use this sequence-similarity analysis method to predict protein sub-cellular localization on two commonly used datasets. The results show that the method is reasonable.
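
The abstract does not give the codon-based mapping or the exact distance formula, so the following Python sketch only illustrates the general idea of comparing sequences through a 2D graphical representation. The step vectors in `STEP`, the centroid descriptor, and all function names are hypothetical choices made for illustration; they are not the authors' construction.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical assignment (NOT the paper's codon-based mapping):
# amino acid number i gets the unit vector at angle 2*pi*i/20.
STEP = {
    aa: (math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20))
    for i, aa in enumerate(AMINO_ACIDS)
}


def curve(seq):
    """Return the 2D walk of a protein sequence as a list of (x, y) points."""
    x = y = 0.0
    points = []
    for aa in seq:
        dx, dy = STEP[aa]  # raises KeyError for non-standard residues
        x += dx
        y += dy
        points.append((x, y))
    return points


def descriptor(seq):
    """Centroid of the 2D walk, used here as an assumed fixed-length descriptor."""
    if not seq:
        raise ValueError("sequence must not be empty")
    points = curve(seq)
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def distance(seq_a, seq_b):
    """Euclidean distance between the descriptors of two sequences."""
    return math.dist(descriptor(seq_a), descriptor(seq_b))
```

A pairwise matrix of such distances could then be passed to a standard tree-building or nearest-neighbour procedure. Which procedures the authors used is not stated in the abstract.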

Similar resources

Novel 2D Graphic Representation of Protein Sequence and Its Application

In recent years, more and more researchers have presented graphic representations of protein sequences. Here, we present a new 2D graphic representation of protein sequences based on four physicochemical properties of the 20 amino acids. Using this graphic representation, we define a distance formula to quantitatively calculate the degree of similarity between protein sequences. Then ...

Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks

Background: Prediction of protein localization is among the most important problems in bioinformatics and is used to predict where proteins reside in cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied to the prediction of intracellular protein locations. These algorithms use the features extracted from pro...

Prediction of Protein Sub-cellular Localization using Information from Texts and Sequences

This paper presents a novel prediction approach for protein sub-cellular localization. We have incorporated text and sequence-based approaches.

Bioinformatic Analysis of L-Asparaginase II from Citrobacter Freundii 1101, Erwinia Chrysanthemi DSM 4610, E. coli BL21 and Klebsiella Pneumoniae ATCC 10031

Background and Aims: L-Asparaginase II is a cornerstone of treatment protocols for acute lymphoblastic leukemia. Only asparaginase II obtained from E. coli K12 and Erwinia chrysanthemi has been used in humans as a therapeutic drug. The therapeutic effects of asparaginase II from E. coli K12 and Erwinia chrysanthemi are accompanied by side effects. It is desirable to search for other asparaginase I...

LOC3D: annotate sub-cellular localization for protein structures

LOC3D (http://cubic.bioc.columbia.edu/db/LOC3d/) is both a weekly-updated database and a web server for predictions of sub-cellular localization for eukaryotic proteins of known three-dimensional (3D) structure. Localization is predicted using four different methods: (i) PredictNLS, prediction of nuclear proteins through nuclear localization signals; (ii) LOChom, inferring localization through ...

Journal title:
  • Bioinformatics

Volume 26, Issue 21

Pages  -

Publication date 2010